Monolingual alignment with moves for genetic criticism

Author

  • Julien Bourdaillet
Abstract

ABSTRACT. This article presents the problem of monolingual alignment with detection of moves. The problem is raised by genetic criticism, a discipline of literary studies. Existing computational solutions do not satisfactorily address this NP-hard problem. We propose borrowing from bioinformatics and text algorithmics a family of algorithms called fragment alignment. An adaptation of this type of algorithm to natural language processing (NLP) is described. Our method aligns two texts while detecting moves, with character-level precision, in a way that scales, and for any alphabetic language. An experimental evaluation presents the good results obtained in comparison with other methods.
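To make the notion of "alignment with moves" concrete, here is a minimal Python sketch. It is not the fragment alignment method of the article, nor MEDITE; it is a simple heuristic built on the standard library's `difflib`, given only as an illustration. The function name `align_with_moves` and the parameter `min_len` are hypothetical. The idea: first compute an ordinary alignment (invariant blocks in the same order), then look for text deleted from the first version that reappears in an inserted region of the second version, and report it as a candidate move.

```python
from difflib import SequenceMatcher


def align_with_moves(a, b, min_len=4):
    """Illustrative heuristic only (assumption: not the article's algorithm).

    Returns (invariants, moves) where invariants are the blocks aligned in
    order, and moves are (pos_in_a, pos_in_b, length) triples for deleted
    text of `a` that reappears inside inserted text of `b`.
    """
    sm = SequenceMatcher(None, a, b, autojunk=False)
    invariants, deleted, inserted = [], [], []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            invariants.append(a[i1:i2])
        else:
            # "replace", "delete" and "insert" all leave unaligned spans.
            if i2 > i1:
                deleted.append((i1, i2))
            if j2 > j1:
                inserted.append((j1, j2))

    moves = []
    for i1, i2 in deleted:
        best = None
        for j1, j2 in inserted:
            # Longest common substring between a deleted and an inserted span.
            m = sm.find_longest_match(i1, i2, j1, j2)
            if m.size >= min_len and (best is None or m.size > best.size):
                best = m
        if best is not None:
            moves.append((best.a, best.b, best.size))
    return invariants, moves


if __name__ == "__main__":
    old = "The cat sat. A dog ran. "
    new = "A dog ran. The cat sat. "
    invariants, moves = align_with_moves(old, new)
    print("invariants:", invariants)
    for pos_a, pos_b, size in moves:
        print("moved:", repr(old[pos_a:pos_a + size]), pos_a, "->", pos_b)
```

This sketch keeps at most one candidate move per deleted span and makes no attempt at optimality. The abstract describes the underlying problem as NP-hard, which is why heuristic approaches are used in practice.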

Similar resources

Practical block sequence alignment with moves

In this paper we study a sequence alignment problem motivated by textual genetic criticism, a humanities discipline where the notion of edit distance with moves has been rediscovered by philologists. We present a formulation of the problem and show that the usual notion of edit distance with moves does not address it correctly because it is harder. We present a heuristic algorithm for this prob...

Alignment of noisy unstructured text data

This paper describes a textual aligner named MEDITE whose specificity is the detection of moves. It was developed to solve a problem from textual genetic criticism, a humanities discipline that compares different versions of authors’ texts in order to highlight invariants and differences between them. Our aligner handles this task and it is general enough to handle others. The algorithm, based ...

MEDITE: A Unilingual Textual Aligner

This paper addresses a problem of natural language text alignment, from a humanities discipline called textual genetic criticism where different text versions must be compared. The paper shows that this task is hard because such versions can be very different and texts with a lot of internal repetitions present specific difficulties. MEDITE is a natural language text aligner that compares texts...

Genetic Criticism and Analysis of Francis Bacon’s Painting, “Three Studies on Figures at the Base of a Crucifixion”

The process of the emergence and formation of a work of art has always been a concern of artists and researchers in art studies. The creation of artwork has also been the subject of educational programs due to its technical aspects. This underscores the importance of studying the path of the creation of a work. This is possible through genetic criticism and its methodological feasibility....

Inducing Bilingual Lexicons from Small Quantities of Sentence-Aligned Phonemic Transcriptions

We investigate induction of a bilingual lexicon from a corpus of phonemic transcriptions that have been sentence-aligned with English translations. We evaluate existing models that have been used for this purpose, and report two additional models which demonstrate performance improvements. The first performs monolingual segmentation followed by alignment, while the second performs both tasks jo...

Journal title:
  • TAL

Volume 50  Issue 

Pages  -

Publication date 2009